Abstract: 

Modern business and technical decisions are based on using the results of analyses, which are only as 
good as the data used. When considering "reliability data", the concern is how long a system will 
continue to operate as designed. Ideally, a large set of pass/fail tests or observations to estimate the 
probability of failure of the item under test would produce the best data. However, this would be a 
costly endeavor if used for every analysis and every design. 

Developing specific data is costly and time consuming. Instead, analysts rely on available data to assess 
reliability. Finding data relevant to the specific use and environment for any project is difficult, if not 
impossible. Instead, we attempt to develop the "best" or composite analog data to support our 
assessments. 

One method reviews existing data sources, identifies the available information on similar equipment, 
and then uses that generic data to derive an analog composite. Dissimilarities in equipment 
descriptions, environment of intended use, quality and even failure modes affect which data is "best" 
to incorporate in an analog composite. Once developed, the composite can be used to support early 
trades and models that establish predicted reliability data points. Those data points may be used as a 
prior and updated based on observations during development, test and operations. 

The better and more project specific the data, the more accurate the analysis and hopefully the better 
the final decision. 

Introduction: 

Data is information in a raw or unorganized form that represents facts, conditions or ideas. In the 
reliability arena it is usually in the form of facts or statistics that can be analyzed and used to provide 
additional information for system operations or improvement. Interpreting raw data and manipulating it 
into the form that provides the most useful input is crucial for deriving the most realistic and effective 
analysis. 

For many computing and data processing models, the data is often represented in a tabular structure 
(rows and columns), a tree (nodes with parent-child relationship), or a graph (connected nodes). Data 
can be the result of observations, measurements or research. Generally, large sets of data are more 
easily visualized using graphs and charts. 

Raw or unprocessed data refers to a collection of numbers or characters. Data processing commonly 
occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the 
next. Field data refers to raw data that is collected without rigorous procedural control in a normal 
working environment as part of operational measurements, verification of meeting specifications or 
data reporting requirements. Experimental data is generated as part of a specific study with a defined 
purpose, and recorded using rigorous processes and procedures within the context of a scientific study. 

Knowing how the data was generated can assist the data analyst if questions arise regarding differences 
in values for the same or similar components. The general preference is to locate data that is similar to 
the specific project or system under review. Such data is more likely to be representative of observed 
operations, providing a more accurate analysis and a better decision. 

The topics discussed in this paper include: where to find data, source selection, taxonomies, data 
development protocols, development of generic datasets, using generic data composites to influence 
design and operations, and deriving a composite generic failure rate that supports the quantification of 
early design risk and reliability models. Once we have a number, we need to address whether it is "good 
enough". 

Finding the Data: 

Reliability performance data is critical to quantifying a system or product risk, identifying the reliability 
of systems and equipment, defining cost effective maintenance cycles, and providing accurate 
information for logistics support. Identifying data sources is one of the challenges the analyst faces. 

There are a number of common complaints about data. When asked, most people will state that data 
sources do not exist for their specific needs, systems, components or whatever it is they are looking at. 

If they are aware that something exists, the assertion is that it is not relevant to their needs, or the data 
is unusable because it is not an exact fit with their component or system or environment. Although this 
is sometimes true, the arguments are based on biased assumptions. Often there is useful information 
available. The analyst needs to seek out potential sources. 

Federal agencies, national and international consortia, industry groups and commercial entities develop, 
publish and maintain risk and reliability data based on observed operations, test, failures, warranty, and 
expected life. 

Reliability data and associated technical reports are generated by and in support of numerous federal 
agencies. A great deal of the information is available for public access. Legal considerations and 
constraints preclude full access to all federal agency datasets and reports. Information of a sensitive 
nature or with limited distributions normally requires special access. Those that are available for public 
distribution may be located using internet search capabilities or may be purchased from the agency 
technical publications libraries. Several federal agency web sites include technical report libraries. One 
source of government funded studies is maintained by the Defense Technical Information Center (DTIC). 

National laboratories are also a good source to find available data. Most work performed by the 
laboratories is in support of federal agencies. The generated reports are often available from supported 
agency technical report libraries with unlimited or public access distribution. A number of the national 
laboratories provide access to their technical report libraries. 

An alternative to searching various federal agency technical libraries is to access the National Technical 
Reports Library under the National Technical Information Service run by the Department of Commerce. 
They provide an extensive library of published historic government reports. 

In addition to federal agencies, national and international professional organizations develop and 
maintain technical documents, reports and datasets. Criteria to access those reports differ between 
organizations. 

Finally, commercial activities provide access to datasets, technical reports, and applications that 
document reliability performance parameters. 

Examples of the data sources and the types of documents available are shown below. 


Data Sources 

Generic Data 

  Published Data Sources 
    European Industry Reliability Data - EIReDA 
    Failure Rate Data In Perspective - FARADIP 
    Component Reliability Data for use in Probabilistic Safety Assessment - IAEA TECDOC-478 
    Generic Component Reliability Data for Research Reactor PSA - IAEA-TECDOC-930 
    Centralized Reliability and Events Database - ZEDB 
    Risk Assessment Data Directory - OGP 434-A1 
    Generic Component Failure Data Base for Light Water and Liquid Sodium Reactor PRA - EGG-SSRE-8875 

  Web Based Data Sources 
    Industry Average Parameter Estimates, U.S. NRC 
    Failure Rates, llity #Engineering 
    Weibull Database, Barringer & Associates, Inc 

  Commercial Data Sets 
    Reliability Automated Databook - RIAC RAD (NPRD, EPRD, FMD) 
    System and Part Integrated Data Resource - SPIDR 
    Offshore Reliability Data Handbook - OREDA 
    Reliability Data for Control and Safety Systems - PDS Data Handbook 
    Safety Equipment Reliability Handbook - PDS Data Handbook 
    Reliability of Well Completion Equipment - Wellmaster 
    Subsea Reliability Data - SubseaMaster 

Industry Data 

  Industry and Vendor Technical Reports 
    Guidelines for Process Equipment Reliability Data with Data Tables, Center for Chemical Process Safety 
    Design of Reliable Industrial and Commercial Power Systems - IEEE STD 493-2007 
    IEEE Guide to the Collection and Presentation of Electrical, Electronic, Sensing Component, and 
      Mechanical Equipment Reliability Data for Nuclear-Power Generating Stations - IEEE STD 500 
    Historic Reliability Data for IEEE 3006 Standards: Power System Reliability - IEEE 3006 
    Fairchild Semiconductor Reliability Report 
    A Summary and Assessment of Historical Reliability and Maintainability Data for Active Solar Hot 
      Water and Space Conditioning Systems - SERI TR-253-2120 

  Failure reporting and corrective actions 
    Dependability Management - Part 3-2. Application Guide - Collection of dependability data from the 
      field, IEC 60300-3-2 
    Nuclear Power Plants - Reliability Data Exchange - General Guidelines, ISO 6527 
    Nuclear Power Plants - Guidelines to assure quality of collected data on Reliability, ISO 7385 
    Petroleum, petrochemical and natural gas industries - Collection and exchange of reliability and 
      maintenance data for equipment, ISO 14224 
    Petroleum, petrochemical and natural gas industries - Production assurance and reliability 
      management, ISO 20815 
    Collection and Exchange of Reliability and Maintenance Data for Equipment, API STD 689 
    Performance-Based Failure Reporting, Analysis & Corrective Action System (FRACAS) Requirements, 
      AIAA S-102.1.4 
    Standard Classification for Hierarchy of Equipment Identifiers and Boundaries for Reliability, 
      Availability, and Maintainability (RAM) Performance Data Exchange, ASTM F2446 
    Guide to the Collection and Presentation of Electrical, Electronic, Sensing Component, and 
      Mechanical Equipment Reliability Data for Nuclear-Power Generating Stations, IEEE 500 
    Recommended Practice for Reporting Field Failure Data for Power Circuit Breakers, IEEE 1325 

Predictive Methodologies 

  Military Handbooks 
    Reliability Prediction of Electronic Equipment, MIL-HDBK-217 
    Handbook of Reliability Prediction Procedures for Mechanical Equipment, MechRel 

  National Consortia Standards 
    Failure Rate Estimating, GEIA SSB-1.004 
    American National Standard for Reliability Prediction, ANSI/VITA 51.0-2008 
    Guide for Selecting and Using Reliability Predictions Based on IEEE 1413, IEEE 1413.1 

  International Standards 
    Reliability Data Handbook - Universal model for reliability prediction of electronics components, 
      PCBs and equipment, IEC TR-62380 
    Reliability - Reference conditions for failure rates and stress models for conversion, IEC-61709 
    Reliability Data Handbook, RDF2000 (UTE C 80-800) 
    Reliability Methodology for Electronic Systems, FIDES Guide 
    Reliability Prediction Model for Electronic Equipment, GJB/Z 299B 

  Commercial Practice 
    Reliability Prediction Procedure for Electronic Equipment, Telcordia SR-332 
    Next Generation Reliability Prediction, 217Plus 
    PRISM System Reliability Assessment Software Tool 
    Reliability and Maintainability Predictions, Frontis Corp 


A general knowledge of industry data sources is key to data collection and aggregation efforts. 
Canvassing different industries and technical organizations provides a wealth of information. Individuals, 
companies, research organizations, technical organizations and industry-wide efforts have all identified 
and collected data. Examples of these sources include the Center for Chemical Process Safety (CCPS), a 
function of the American Institute of Chemical Engineers (AIChE), and their "Industry Process Equipment 
Reliability Database"; the Institute of Electrical and Electronics Engineers (IEEE) "Recommended Practice 
for the Design of Reliable Industrial and Commercial Power Systems"; and the oil and gas industry's 
"Offshore Reliability Data Handbook". Other published data sources include the Department of Defense 
(DOD) Reliability Information Analysis Center and their published Databooks, as well as European 
datasets like Failure Rate Data in Perspective (FARADIP) and the European Industry Reliability Data 
Bank (EIReDA). 

Empirical data, often centered on observation or experience, may be based on historical facts or raw 
field data. Data sources might include test documents and field data of electronic, electro-mechanical, 
and mechanical systems, assemblies, and parts. Field data is often based on identical, similar or 
equivalent items. Sources for empirical data include the commercial compendiums as well as industry 
technical reports. Details from maintenance records are also a source of observed empirical data. The 
value of the detail is predicated on robust maintenance data collection efforts. Best practices and 
standards pertinent to field data collection are available and should comprise core protocols as part of 
an organization's failure reporting and corrective action system (FRACAS). 

Other data sources include predictive methodologies, which use mathematical models to make 
predictions about inherent reliability. Although the results tend to be conservative, these 
methodologies are often used for new technologies and remain in use in various commercial reliability 
analysis applications. These models include both the national and international standards as well as 
models incorporated into various reliability program applications. 

Another alternative for addressing reliability is the use of physics of failure models. This approach 
uses models and simulations to design in reliability by understanding system performance, reducing 
decision risk during design and improving equipment reliability in the field. The simulation includes 
modeling the root causes of failure such as environmental factors, wear and material characteristics. If 
physics-based models are to be employed, the information needed includes defect rates, material 
properties (e.g., functional characteristics), defect (flaw) distributions, and material variation 
quantification (e.g., purity, yields, dimensions). 

Source Selection: 

In general, it is best to use data sources pertinent to the specific industry, since these will be similar in 
specifications, design requirements, environmental conditions and operational characteristics. 
Corporate, industry and various standards consortia should influence any developed protocols that will 
be used to support the data source selection. Selection criteria should consider: 

• Data validity 

• Suitability 

• Maintenance of Data 

• Recognition of data source 

• Usability 

• Cost 

Ultimately, source selection will be based on the availability of source datasets, the viability of the 
reported data and its applicability to the intended use, all of which will be influenced by corporate 
policies and professional best practice. 

Process Flow: 

A critical effort for developing internal data processing protocols is defining the documentation needed 
for each decision. These decision points will have an impact on how data is captured, processed, and 
reported. The example data process shown below is modified from the CCPS process flows and defined 
in their text "CCPS Guidelines for Chemical Process Quantified Risk Analysis", 1996. 

Data Development: 

The development of composite analog data requires the identification of the dataset taxonomy. This 
taxonomy, or method of classification identified, is influenced by the data analysis protocols used to 
derive an aggregated reliability performance parameter based on similar equipment types and failure 
modes from multiple data sources. 

The data collection and reporting taxonomy should be clearly defined early in the data collection 
process. Industry taxonomy best practice is a good resource for developing the look and feel of the 
collection record based on indentured levels. Setting up this taxonomy early will save time and effort 
by precluding the need to revisit multiple sources in order to capture details that were missed or 
overlooked. The indenture level and boundary should be clear to both the developer and the user. 
Example sources available to assist in defining taxonomies include the Center for Chemical Process 
Safety's Process Equipment Reliability Database (PERD) Taxonomies and the International Organization 
for Standardization (ISO) Standard 14224: Petroleum, petrochemical and natural gas industries - 
Collection and exchange of reliability and maintenance data for equipment. 

Data development protocols are the corporate or industry best practice describing how composite data 
records will be collected, aggregated and reported. As a minimum, they define the data required and 
how the details will be combined or aggregated to develop the resulting generic performance 
parameter. Considerations should include: 

• Limited observations (operating time, number of demands or cycles) 

• Zero failures (estimate performance parameter) 

• Use of adjustment factors (referred to as Pi-factors or logistic performance factors) 

• Environmental conditions of observed failures 

• Quality level of the part or item under study 

• Duplicate records 

• Minimum number of records needed 

• Rationale for inclusion and exclusion of records 

• Confidence level or uncertainty bounds 

• Use of models to derive performance measures: 

• Aggregation of failures and observed time, cycles or miles 

• Weibull analysis 

• Bayesian analysis 

• Use of logistic performance parameters (Pi-factors) to convert quality and environment factors 
to a common known level. 

Capturing observed operational hours and failure data from multiple sources allows the development of 
industry and generic composite failure rates. The industry protocols applied should be clearly defined 
so that they address the variance in environment and quality, as well as the record exclusions needed 
to refine the composite failure rate so that it reflects an expected quality level and environment. 
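
The aggregation of failures and observed time described above can be sketched in a few lines. This is a minimal illustration rather than a prescribed protocol: the record layout, the function name and the example values are hypothetical, and the 0.5 failure offset (the mean of a Jeffreys-prior posterior for a Poisson rate) is one common way to obtain a non-zero estimate when zero failures have been observed. Other organizations may use a chi-square upper bound or a different prior.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SourceRecord:
    """Observed experience reported by one data source (hypothetical layout)."""
    name: str       # label for the originating source
    failures: int   # number of observed failures
    hours: float    # observed operating time in hours


def pooled_failure_rate(records: Iterable[SourceRecord], jeffreys: bool = True) -> float:
    """Composite failure rate (per hour) from total failures and total hours.

    With jeffreys=True, 0.5 is added to the failure count, so a source set
    with zero failures still yields a non-zero estimate (an assumption of
    this sketch, not a requirement of any cited standard).
    """
    records = list(records)
    total_failures = sum(r.failures for r in records)
    total_hours = sum(r.hours for r in records)
    if total_hours <= 0:
        raise ValueError("total observed time must be positive")
    offset = 0.5 if jeffreys else 0.0
    return (total_failures + offset) / total_hours


# Hypothetical records; none of these values come from a real dataset.
records = [
    SourceRecord("hypothetical source A", failures=3, hours=2.0e6),
    SourceRecord("hypothetical source B", failures=0, hours=5.0e5),
]
composite = pooled_failure_rate(records)
print(f"{composite:.2e} failures per hour")
```

Pooling in this way assumes the records have already been screened for duplicates and converted to a common environment and quality level, as the protocol considerations above require.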

The development of composite analog data will only be as good as the consistent implementation of the 
data processing protocols. 

Data Composites and Design: 

Generic data aggregation provides an initial predictive reliability estimation of equipment types, 
assemblies, parts or components. The data processing protocols provide the means to derive and refine 
the predictive measures. Dataset architecture (planning, designing, and constructing how to format the 
datasets) controls the level of detail based on the data taxonomy and industry or developed protocols. 
Good data collection is critical to using data effectively. The dataset architecture needs 
to include: 

• Equipment type identification 

• Failure statistics (i.e., failure rate, observed operational time, and failures) 

• Application information (i.e., environment, quality level) 

• Failure modes and distributions 

Additional data is needed when a composite failure rate is to be applied to a model or 
incorporated into a project-specific data set. Elements needed would be: 

• System information (parts breakout or master equipment list) 

• Number of systems 

• Dates fielded for each system 

• Location of operation 

• Unique identifier for each system 

• Environment of intended use 
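
One way to picture the dataset architecture listed above is as a typed record. The sketch below is illustrative only: the class and field names are hypothetical, the example values are invented, and a real collection record would follow the taxonomy and indenture levels chosen by the organization.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailureModeShare:
    """Fraction of the observed failures attributed to one failure mode."""
    mode: str
    fraction: float  # expected to lie between 0 and 1


@dataclass
class CompositeRecord:
    """One composite data record (hypothetical field names)."""
    equipment_type: str
    source: str
    failures: int
    operating_hours: float
    environment: Optional[str] = None     # application information
    quality_level: Optional[str] = None   # application information
    failure_modes: List[FailureModeShare] = field(default_factory=list)

    @property
    def failure_rate(self) -> float:
        """Point estimate: observed failures divided by observed hours."""
        return self.failures / self.operating_hours

    def mode_fractions_consistent(self, tol: float = 1e-6) -> bool:
        """True when the failure mode distribution is empty or sums to 1."""
        if not self.failure_modes:
            return True
        return abs(sum(m.fraction for m in self.failure_modes) - 1.0) <= tol


# Invented example values, used only to show the record in use.
record = CompositeRecord(
    equipment_type="Valve, Manual",
    source="hypothetical plant maintenance records",
    failures=4,
    operating_hours=8.0e6,
    environment="Ground Fixed",
    quality_level="Commercial",
    failure_modes=[
        FailureModeShare("External leak", 0.75),
        FailureModeShare("Internal leak", 0.25),
    ],
)
print(f"{record.failure_rate:.2e}", record.mode_fractions_consistent())
```

Keeping the raw failures and operating hours in the record, rather than only the derived rate, lets the record be pooled with other sources or updated later.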

Data acquired from tests and field surveillance should be used to update the generic data. Field data is 
probably the most valuable type of data for this purpose since it represents the actual product or system 
in the intended use environment. 
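
Updating generic data with test or field observations is often done with a conjugate gamma-Poisson model, sketched below. The choice of model, the prior shape of 0.5 and every numeric value are assumptions made for illustration; the protocol in use should state which prior and which update rule apply.

```python
def gamma_prior_from_mean(mean_rate: float, alpha: float = 0.5):
    """Build a gamma prior (alpha, beta) whose mean equals the generic rate.

    alpha is the shape parameter and beta is in hours. A small alpha gives
    a diffuse prior; 0.5 is an illustrative choice, not a recommendation.
    """
    if mean_rate <= 0 or alpha <= 0:
        raise ValueError("mean_rate and alpha must be positive")
    return alpha, alpha / mean_rate


def update_with_observations(alpha: float, beta: float, failures: int, hours: float):
    """Conjugate update: add observed failures to alpha and hours to beta."""
    return alpha + failures, beta + hours


# Hypothetical generic composite rate and hypothetical field experience.
alpha0, beta0 = gamma_prior_from_mean(1.0e-6)
alpha1, beta1 = update_with_observations(alpha0, beta0, failures=2, hours=1.0e6)
posterior_mean = alpha1 / beta1
print(f"prior mean {alpha0 / beta0:.2e}, posterior mean {posterior_mean:.2e}")
```

As more project-specific hours accumulate, the observed data dominates the posterior and the influence of the generic prior fades.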

Validating the aggregate 

Reliability performance measurements are not easy to find. However, data and useful details for 
analyzing the data do exist. Industry datasets, technical documents and maintenance records are a 
viable source to develop quantitative reliability factors. With data source information documented and 
the data appropriately developed, their use in quantitative models produces effective measurements of 
risk. These risk and reliability models also support the quantitative logistics parameters associated with 
maintenance planning, spares acquisitions and provisioning. 

"Any Number - Is not good enough" 

When proposing to use a number to quantify a risk or reliability model from alternative industry or 
commercial sources, we need to determine if the number "makes sense". Before using "just any 
number", compare available parameters documented in multiple sources to provide correlation of 
reported data as an aid in determining if the recommended parameters are "In-family" or "Out of 
expected range". 

The details used to quantify risk and reliability models include the failure rate, based on time or cycles, 
and error factors or uncertainty, used to generate the upper and lower failure rate constraints at a 
predetermined percentile. 

Comparing the reported performance measures from multiple data sources allows the determination of 
"in-family" or "out of expected range". By creating the data records used in the data comparison, the 
details can be used to validate the predictive or aggregated values. The comparison supports the data 
selection process used in the quantification of risk and reliability models. It supports conclusions and 
recommendations when the model output or result does not meet reviewer or decision maker 
expectations. The reliability measures needed for the comparison should include: 

• Time related failure rate 

• Demand related failure rate 

• Error factor 

• 5th Percentile 

• 95th Percentile 

• Upper Bound 

• Lower Bound 

Not all industries use the same models and distribution parameters. Knowing how the reported 
parameters are calculated is important if the data represented must be modified or normalized as part 
of the comparison process. 
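
As a sketch of how an error factor (EF) can generate the upper and lower bounds, the snippet below multiplies and divides the reported rate by the EF. This convention is an assumption: under a lognormal model the EF is strictly applied to the median rather than the mean, and a given source may define its bounds differently. The function name is hypothetical; the input values are those reported for the Reliability Automated Databook entry in this paper's summary worksheet.

```python
def bounds_from_error_factor(rate: float, error_factor: float):
    """Return (lower, upper) bounds as rate / EF and rate * EF.

    Assumes the EF is the ratio of the 95th percentile to the reported
    rate, which is a simplification adopted for this sketch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if error_factor < 1:
        raise ValueError("error factor must be at least 1")
    return rate / error_factor, rate * error_factor


lower, upper = bounds_from_error_factor(2.76e-05, 5.70)
print(f"5%: {lower:.2e}  95%: {upper:.2e}")
```

The results agree with the worksheet's 5% and 95% entries for that source to within rounding, which suggests, but does not confirm, that the worksheet uses the same convention.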

Comparing multiple sources 

Implementing our data collection and data processing protocols allows the generation of worksheets 
showing the level of detail available from each source. Applying the protocols consistently, including 
all conversions needed to normalize details across the originating sources, allows the analyst to 
compare the results as apples to apples. Data sources from various industries will have different 
operating conditions, quality levels, and processes to identify and assess performance. The protocols 
should address how the variances are to be handled. 
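
One way to handle such variances is to convert each reported rate to a common environment and quality level using ratios of adjustment factors (Pi-factors). The sketch below assumes a multiplicative model in which a rate scales with an environment factor and a quality factor. The function name and all factor values are hypothetical placeholders; real factors would come from the prediction standard or protocol adopted by the organization.

```python
def normalize_rate(rate: float,
                   pi_env_source: float, pi_env_target: float,
                   pi_qual_source: float, pi_qual_target: float) -> float:
    """Convert a failure rate from its source conditions to target conditions.

    Assumes rate is proportional to pi_env * pi_qual, so the conversion is
    a ratio of target factors to source factors.
    """
    factors = (pi_env_source, pi_env_target, pi_qual_source, pi_qual_target)
    if any(f <= 0 for f in factors):
        raise ValueError("adjustment factors must be positive")
    return rate * (pi_env_target / pi_env_source) * (pi_qual_target / pi_qual_source)


# Hypothetical example: a rate observed in a harsher environment (factor 4.0)
# converted to a benign one (factor 1.0), and from a higher quality part
# (factor 1.0) to a lower quality part (factor 2.0).
normalized = normalize_rate(2.0e-6,
                            pi_env_source=4.0, pi_env_target=1.0,
                            pi_qual_source=1.0, pi_qual_target=2.0)
print(f"{normalized:.2e} failures per hour")
```

Recording the factors used alongside each converted value keeps the normalization traceable when results are questioned.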

Comparing the reported data from multiple sources provides insight and validity to the model when 
those results are used to quantify risk and reliability models. An example that shows the variance noted 
between the various data sources is easily seen when comparing details for Valves. Different industries, 
and even different reports within an industry, use different formats and report results differently. 


ALTERNATIVE FAILURE RATE DATA: 

Dataset 1: 
Source: Summary composite Reliability Analysis Information Center's Reliability Auto 

λ          Variance   EF      Equivalent Component                  Comment 
2.76E-05   -          5.70    -                                     - 

Dataset 2: 
Source: Summary composite based on total hours and total failures 

λ          Variance   EF      Equivalent Component                  Comment 
4.66E-08   1.80E-14   11.67   Valve, Manual                         Observed operational time: 100961448; Observed Failures: 47 
2.62E-07   -          1.34    Valve, Manual, External Leak Small    Observed operational time: 100961448; Observed Failures: 26 
1.34E-07   -          1.49    Valve, Manual, Internal Leak Small    Observed operational time: 100961448; Observed Failures: 13 
8.42E-08   -          1.63    Valve, Manual, Spurious Operation     Observed operational time: 100961448; Observed Failures: 8 

Dataset 3: 
Source: Idaho Chemical Processing Plant Failure rate database, INEL-95/0422 

λ          Variance   EF      Equivalent Component                  Comment 
1.03E-06   3.63E-06   6.18    Valve, All                            - 
2.52E-07   1.79E-06   12.80   Valve, Leak                           - 
6.58E-08   9.17E-07   28.22   Valve, Plug                           - 
7.53E-07   3.10E-06   7.24    Valve, Other                          - 

Dataset 4: 
Source: Generic Component Reliability Data for Research Reactor PSA, IAEA TECDOC- 

λ          Variance   EF      Equivalent Component                  Comment 
-          -          -       Valve, manual. Failure to function    Upper Control Limit: 1.0E-06; Lower Control Limit: 
4.60E-06   -          -       Valve, manual. Degraded               Upper Control Limit: 6.2E-06; Lower Control Limit: 7.0E-07 
5.50E-06   -          -       Valve, manual. Degraded               Upper Control Limit: 1.31E-05; Lower Control Limit: 1.10E-06 
1.35E-05   -          -       Valve, manual. Failure to function    Upper Control Limit: 3.2E-05; Lower Control Limit: 2.40E-06 
1.35E-05   -          -       Valve, manual. Leakage                Upper Control Limit: 3.2E-05; Lower Control Limit: 2.40E-06 

Using a summary worksheet shows the reported failure rate and allows for the calculation of upper and 
lower bounds. This also shows the specified or assumed quality level and environment. 


Component Type Name: 

Failure rate per million hours 

Dataset Source                                          Original Qlty Level   Original Environment   Mean       95%        5%         EF 
Reliability Automated Databook (RAD)                    Military              Aviation               2.76E-05   1.57E-04   4.85E-06   5.70 
System and Part Integrated Data Resource (SPIDR)        -                     -                      -          -          -          - 
Offshore Reliability Data Handbook (OREDA)              -                     -                      -          -          -          - 
Safety Equipment Reliability Handbook                   -                     -                      -          -          -          - 
European Industry Reliability Data (EIReDA)             -                     -                      -          -          -          - 
Process Equipment Reliability Data (PERD)               -                     -                      -          -          -          - 
Weibull Failure Database                                -                     -                      -          -          -          - 
Failure Rate Data in Perspective (FARADIP)              -                     -                      -          -          -          - 
NRC-SPAR, Industry Average Parameter Estimates          Commercial            Ground Fixed           4.66E-08   5.43E-07   3.99E-09   11.67 
National Consortia: Technical Reports (Compiled)        -                     -                      -          -          -          - 
National Laboratory: Technical Reports (Compiled)       Commercial            Ground Fixed           1.03E-06   6.37E-06   1.67E-07   6.18 
International Consortia: Technical Reports (Compiled)   Commercial            Ground Fixed           3.00E-07   3.00E-06   3.00E-08   10.00 
Commercial Vendor: Technical Reports (Compiled)         -                     -                      -          -          -          - 
Generic Risk and Reliability                            -                     -                      -          -          -          - 
Selected Failure Rate                                   -                     -                      -          -          -          - 

The original failure rates, as processed using our data protocols, are captured and then used to 
calculate the upper and lower bounds based on the known or assumed error factor. The resulting chart 
is a graphic depiction that allows the reliability analyst to consider values to be used to quantify risk 
and reliability models. 

[Chart: failure rate per million hours] 

By comparing the reliability records of multiple data sources, the analyst can evaluate the applicability 
of specific failure rate data to quantify the risk and reliability models. Comparing the numbers provides 
a basis for decisions to use or discard proposed failure data. Acceptance/rejection rationale can be 
developed supporting the use of specific data sources and failure rates used to quantify the model or 
verify the range of data submitted by a vendor or contractor. 
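
A simple way to support an acceptance or rejection rationale is to count how many sources have bounds that contain a proposed failure rate. The sketch below assumes each source's interval runs from mean / EF to mean * EF; that convention, the function name and the candidate value are assumptions made for illustration. The means and error factors are the four populated rows of the summary worksheet, and how many supporting sources make a value "In-family" is a judgment the protocol should define.

```python
def sources_containing(candidate: float, sources: dict) -> list:
    """Names of sources whose [mean / EF, mean * EF] interval holds candidate.

    sources maps a source name to a (mean, error_factor) pair.
    """
    supporting = []
    for name, (mean, ef) in sources.items():
        if mean / ef <= candidate <= mean * ef:
            supporting.append(name)
    return supporting


# Mean and EF values taken from the summary worksheet in this paper.
worksheet = {
    "RAD": (2.76e-05, 5.70),
    "NRC-SPAR": (4.66e-08, 11.67),
    "National Laboratory": (1.03e-06, 6.18),
    "International Consortia": (3.00e-07, 10.00),
}

# Hypothetical vendor-proposed failure rate to be checked.
supporting = sources_containing(5.0e-07, worksheet)
print(f"{len(supporting)} of {len(worksheet)} sources bracket the candidate:", supporting)
```

A candidate bracketed by most sources can be defended as in-family; one bracketed by few or none calls for a closer look at environment, quality level or failure mode definitions before it is used.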

When responding to critics, it is evident the numbers used are "In-family" with reported data reflecting 
operational conditions. 

Summary and conclusions: 

The reliability analyst is often tasked to determine the probability of failure or success of a system based 
on a new or incomplete design using state of the art equipment with little or no failure history. To 
accomplish that task requires planning and effort. Modeling the system is one part of the solution, but 
to quantify the model requires data. Developing data that is supportable, traceable, documented, and 
makes sense takes time and effort. And, someone will disagree with the results. By using available data 
sets and comparing the observed operational reliability the analyst can determine if the recommended 
performance parameters used to quantify the model make sense. Establishing the means and methods 
to derive and validate data supports the conclusions and recommendations reported to decision 
makers. 